Prediction in a small-sized sample with a large number of covariates, the "small n, large p" problem, is challenging. This setting is encountered in multiple applications, such as precision medicine, where obtaining additional samples can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the values of the regression coefficients or about the relevance of the features, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of our method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert.
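The sequential querying idea can be illustrated with a minimal sketch. This is not the algorithm proposed here: it assumes a plain Gaussian prior instead of a sparsity-inducing one, models the expert's answer for feature j as a noisy observation of the coefficient w_j, and scores each candidate query by the information gain about the weights, 0.5 * log(1 + Sigma_jj / fb_var). The names `posterior`, `next_query`, `elicit`, `expert`, and `fb_var` are all hypothetical, introduced only for illustration.

```python
import numpy as np


def posterior(X, y, feedback, noise_var=1.0, prior_var=1.0, fb_var=0.1):
    """Gaussian posterior over weights given data and expert feedback.

    `feedback` maps a feature index j to the expert's stated value for w_j,
    treated as an observation of w_j with variance `fb_var` (an assumption).
    """
    p = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(p) / prior_var
    rhs = X.T @ y / noise_var
    for j, value in feedback.items():
        precision[j, j] += 1.0 / fb_var
        rhs[j] += value / fb_var
    cov = np.linalg.inv(precision)
    return cov @ rhs, cov


def next_query(cov, asked, fb_var=0.1):
    """Pick the not-yet-queried feature with the largest information gain."""
    gains = 0.5 * np.log1p(np.diag(cov) / fb_var)
    for j in asked:
        gains[j] = -np.inf  # never ask about the same feature twice
    return int(np.argmax(gains))


def elicit(X, y, expert, n_queries, **kw):
    """Query `expert(j)` sequentially, then return the final posterior mean."""
    fb_var = kw.get("fb_var", 0.1)
    feedback = {}
    for _ in range(n_queries):
        _, cov = posterior(X, y, feedback, **kw)
        j = next_query(cov, feedback.keys(), fb_var)
        feedback[j] = expert(j)
    mean, _ = posterior(X, y, feedback, **kw)
    return mean, feedback
```

Under these simplifying assumptions the criterion reduces to asking about the coefficient with the largest posterior variance. A sparsity-inducing prior and a criterion based on predictive performance would change both the posterior computation and the scoring step.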